The compositional adjustment of amino acid substitution matrices.
Abstract
Amino acid substitution matrices are central to protein-comparison methods. In most commonly used matrices, the substitution scores take a log-odds form, involving the ratio of "target" to "background" frequencies derived from large, carefully curated sets of protein alignments. However, such matrices often are used to compare protein sequences with amino acid compositions that differ markedly from the background frequencies used for the construction of the matrices. Of course, the target frequencies should be adjusted in such cases, but the lack of an appropriate way to do this has been a long-standing problem. This article shows that if one demands consistency between target and background frequencies, then a log-odds substitution matrix implies a unique set of target and background frequencies as well as a unique scale. Standard substitution matrices therefore are truly appropriate only for the comparison of proteins with standard amino acid composition. Accordingly, we present and evaluate a rationale for transforming the target frequencies implicit in a standard matrix to frequencies appropriate for a nonstandard context. This rationale yields asymmetric matrices for the comparison of proteins with divergent compositions. Earlier approaches are unable to deal with this case in a fully consistent manner. Composition-specific substitution matrix adjustment is shown to be of utility for comparing compositionally biased proteins, including those of organisms with nucleotide-biased, and therefore codon-biased, genomes or isochores.
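The log-odds form and the consistency requirement described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method or data: the two-letter alphabet, the frequencies in `toy_target`, the scale `lam`, and the function names are all hypothetical, chosen only for illustration. It assumes scores of the form s_ij = ln(q_ij / (p_i p_j)) / lambda, with background frequencies p taken as the marginals of the target frequencies q, and it checks that the implied target frequencies sum to 1 only at the scale used to build the scores.

```python
import math

def log_odds_matrix(target, lam):
    """Build log-odds scores from target frequencies q_ij (illustrative only).

    Background frequencies are taken as the row marginals of the target
    frequencies, which is the consistency condition assumed in this sketch.
    """
    n = len(target)
    background = [sum(row) for row in target]
    scores = [
        [math.log(target[i][j] / (background[i] * background[j])) / lam
         for j in range(n)]
        for i in range(n)
    ]
    return background, scores

def implied_target_sum(scores, background, lam):
    """Sum of p_i * p_j * exp(lam * s_ij) over all pairs; equals 1 at the right scale."""
    n = len(scores)
    return sum(
        background[i] * background[j] * math.exp(lam * scores[i][j])
        for i in range(n)
        for j in range(n)
    )

# Hypothetical two-letter alphabet; the target frequencies sum to 1.
toy_target = [[0.4, 0.1],
              [0.1, 0.4]]
lam = 0.5  # hypothetical scale parameter

background, scores = log_odds_matrix(toy_target, lam)
at_true_scale = implied_target_sum(scores, background, lam)
at_wrong_scale = implied_target_sum(scores, background, 1.0)
print(background, at_true_scale, at_wrong_scale)
```

In this toy case identical letters score positively and mismatches negatively, and the normalization holds at `lam = 0.5` but fails at `lam = 1.0`, which loosely mirrors the abstract's point that a log-odds matrix carries its own scale.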
Related articles
The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions
MOTIVATION Amino acid substitution matrices play a central role in protein alignment methods. Standard log-odds matrices, such as those of the PAM and BLOSUM series, are constructed from large sets of protein alignments having implicit background amino acid frequencies. However, these matrices frequently are used to compare proteins with markedly different amino acid compositions, such as trans...
Amino acid substitution matrices for protein conformation identification
Methods for alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. Although widely used, the matrices derived from homologous sequence segments, such as Dayhoff’s PAM matrices and Henikoff’s BLOSUM matrices, are not specific for protein conformation identification. Using a different approach...
Position Dependent and Independent Evolutionary Models Based on Empirical Amino Acid Substitution Matrices
Evolutionary models measure the probability of amino acid substitutions occurring over different evolutionary distances. We examine various evolutionary models based on empirically derived amino acid substitution matrices. The models are constructed using the PAM and BLOSUM amino acid substitution matrices. We rescale these matrices by raising them to powers to model substitution patterns that ...
Nutrient compositional differentiation in the muscle of wild, inshore and offshore cage-cultured large yellow croaker (Pseudosciaena crocea)
The proximate composition, amino acid and fatty acid composition in the muscle of wild, inshore and offshore cage-cultured large yellow croaker (Pseudosciaena crocea) were determined to identify nutritional differences. Wild fish groups showed highest content of moisture and crude protein, but lowest lipid content. Offshore cage-cultured fish showed significantly higher content of moisture and ...
Amino Acid Substitution Matrices Estimated by Maximum Likelihood
The present work describes protrates, a program that estimates amino acid substitution matrices and among-site substitution rates based on their likelihood for a given tree topology and a dataset of aligned proteins. The issue of producing maximum likelihood (ML) rate matrices over protein data has been addressed under the framework of general-purpose unbiased substitution matrices [1, 9], sinc...
Journal:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 100, Issue 26
Pages: -
Publication date: 2003